Analysing and Classifying Names of Chemical Compounds with CHEMorph
Authors
Abstract
We present a prototype system that uses a purely linguistic method to analyse the names of organic chemical compounds. It analyses compound names morpho-semantically, generates line-based, machine-readable representations of the corresponding molecular structures (SMILES strings), and triggers a taxonomic classification. CHEMorph is intended to support manual database curation and to serve as a basis for biochemical text processing. The system is written in Prolog.
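To make the pipeline in the abstract concrete (name, then morphemes, then SMILES string, then class), here is a minimal toy sketch in Python. It is not the authors' Prolog implementation and does not reproduce CHEMorph's grammar. The lexicon `STEMS`, the table `SUFFIXES`, and the function `analyse` are hypothetical names introduced only for illustration. The sketch assumes unbranched chains of at most six carbons, and places any multiple bond or hydroxyl group at position 1.

```python
# Toy morpheme lexicon: stem -> number of carbon atoms.
# Illustrative assumption only, not CHEMorph's actual grammar.
STEMS = {"meth": 1, "eth": 2, "prop": 3, "but": 4, "pent": 5, "hex": 6}

# Suffix -> (bond symbol between C1 and C2, trailing atoms, class label).
# "anol" is listed first so it is tried before the shorter suffixes.
SUFFIXES = {
    "anol": ("", "O", "alcohol"),
    "ane": ("", "", "alkane"),
    "ene": ("=", "", "alkene"),
    "yne": ("#", "", "alkyne"),
}


def analyse(name):
    """Split a simple name into stem + suffix and return (SMILES, class)."""
    name = name.lower().strip()
    for suffix, (bond, tail, label) in SUFFIXES.items():
        if not name.endswith(suffix):
            continue
        stem = name[: -len(suffix)]
        if stem not in STEMS:
            continue
        n = STEMS[stem]
        if bond and n < 2:
            # A multiple bond needs at least two carbon atoms.
            raise ValueError(f"{name!r} has too few carbons for a {label}")
        # Build the chain; the bond symbol (if any) sits between C1 and C2.
        chain = "C" if n == 1 else "C" + bond + "C" * (n - 1)
        return chain + tail, label
    raise ValueError(f"cannot analyse {name!r}")


if __name__ == "__main__":
    for example in ("ethane", "propene", "butyne", "ethanol"):
        print(example, analyse(example))
```

A real analyser for systematic and semi-systematic names would also need locants, substituent prefixes, rings, and stereochemistry, none of which this sketch attempts.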
Related articles
Analysing Names of Organic Chemical Compounds -- From Morpho-Semantics to SMILES Strings and Classes (Web Version)
The linguistic analysis of chemical terminology is a key to biochemical text processing and semi-automatic database curation. The system described analyses systematic and semi-systematic names of chemical compounds, class terms, and also otherwise underspecified names by means of a morpho-semantic grammar developed according to IUPAC nomenclature. It yields an intermediate semantic representati...
A System for Identifying and Classifying Names in Persian Texts
Named entity recognition (NER) identifies one or more kinds of names in a text and classifies them into specified categories. These categories can be names of people, organizations, companies, places (country, city, street, etc.), time expressions (date and time), financial values, percentages, etc. Although during the past decade a lot of research has been done on NER in ...
Identifying and Classifying Terms in the Life Sciences: The Case of Chemical Terminology
Facing the huge amount of textual and terminological data in the life sciences, we present a theoretical basis for the linguistic analysis of chemical terms. Starting with organic compound names, we conduct a morpho-semantic deconstruction into morphemes and yield a semantic representation of the terms’ functional and structural properties. These semantic representations imply both the molecula...
Phenolic compounds as chemical markers of low taxonomic levels in the marine algal genus Laurencia in the Persian Gulf
The genus Laurencia (Rhodomelaceae), a complex group, has 285 species and infraspecific names. Identification and taxonomy of these taxa have mainly been based on flexible morphological characters, which has led to a complicated taxonomy in this group. Nowadays, taxonomical study of this group has changed a lot by using reproductive characters, anatomical differences and modern...
CADMIUM AND CADMIUM COMPOUNDS
Synonyms, trade names and molecular formulae for cadmium, cadmium-copper alloy and some cadmium compounds are presented in Table 1. The cadmium compounds shown are those for which data on carcinogenicity or mutagenicity were available or which are commercially important compounds. It is not an exhaustive list and does not necessarily include all of the most commercially important cadmium-contai...